{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 第八章 树"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8.1 树的基本概念"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.1.1 树的定义和属性"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "树是一种非线性数据结构，用树实现的一系列算法比线性数据结构如链表和数组要快得多。\n",
    "\n",
    "树：存储一系列元素的有限节点集合，这些节点有 parent-children 关系并满足以下属性：\n",
    "\n",
    "* 如果树非空，则一定有一个根节点，且根节点没有父节点。\n",
    "* 每个非根节点都有唯一的父节点，每个具有父节点的节点都是父节点的一个孩子。\n",
    "\n",
    "父节点相同的节点称为兄弟节点，没有孩子的节点称为外部节点，有孩子的节点称为内部节点。外部节点也称为叶子节点。\n",
    "\n",
    "计算机的文件系统是树关系，文件夹是内部节点，文件是外部节点，也称为叶子节点。\n",
    "\n",
    "以某个节点为根节点的子树包含所有这个节点的子孙，这个节点是这些子孙的祖先。\n",
    "\n",
    "树的边是一对相连的节点，树的路径使很多个相连的节点，不过需要一直往下。\n",
    "\n",
    "有序树：树中每个节点的孩子节点有特定的顺序。例如结构化的文档，从1到1.X再到1.X.X等，是有序树。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.1.2 树的抽象数据类型"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "树的抽象数据结构支持以下方法：\n",
    "\n",
    "1. p.element(): 返回存储在位置p处的元素\n",
    "2. T.root(): 返回树T根节点的位置。如果树为空，返回None。\n",
    "3. T.is_root(p): 如果位置p是树T的根，则返回True。\n",
    "4. T.parent(p): 返回位置p的父节点的位置，如果p的位置为树的根节点，则返回None。\n",
    "5. T.num_children(p): 返回位置为p的孩子节点的数量。\n",
    "6. T.children(p): 返回位置为p的孩子节点的一个迭代。\n",
    "7. T.is_leaf(p): 位置p是否为叶子节点（是否有孩子节点）。\n",
    "8. len(T): 树T所包含的元素的数量。\n",
    "9. T.is_empty(): 树T是否为空。\n",
    "10. T.positions(): 生成树T所有位置的迭代。\n",
    "11. iter(T): 生成树T存储的所有元素的迭代。\n",
    "\n",
    "如果输入的位置是无效的，返回一个ValueError。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "下面的代码给出了Python中树的一个抽象基类，其他具体的树类可以通过继承这个抽象基类并定义新的方法来实现。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Tree:\n",
    "    \n",
    "    class Position:\n",
    "        \n",
    "        def element(self):\n",
    "            raise NotImplementedError('must be implemented by subclass')           ## 抽象类不直接定义\n",
    "        \n",
    "        def __eq__(self, other):\n",
    "            raise NotImplementedError('must be implemented by subclass')\n",
    "        \n",
    "        def __ne__(self, other):\n",
    "            return not(self == other)                                              ## 调用 == 即调用__eq__\n",
    "        \n",
    "    def root(self):\n",
    "        raise NotImplementedError('must be implemented by subclass')\n",
    "    \n",
    "    def parent(self, p):\n",
    "        raise NotImplementedError('must be implemented by subclass')\n",
    "    \n",
    "    def num_children(self, p):\n",
    "        raise NotImplementedError('must be implemented by subclass')\n",
    "    \n",
    "    def children(self, p):\n",
    "        raise NotImplementedError('must be implemented by subclass')\n",
    "    \n",
    "    def __len__(self):\n",
    "        raise NotImplementedError('must be implemented by subclass')\n",
    "    \n",
    "    def is_root(self, p):\n",
    "        return self.root() == p\n",
    "    \n",
    "    def is_leaf(self, p):\n",
    "        return self.num_children(p) == 0\n",
    "    \n",
    "    def is_empty(self):\n",
    "        return len(self) == 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8.1.3 计算深度和高度"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "节点p的深度是节点p的祖先的个数，不包括p本身。根节点的深度为0。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "深度的递归定义如下：\n",
    "\n",
    "1. 如果是根节点，则深度为0.\n",
    "2. 如果不是根节点，则深度为其父节点的深度+1。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "def depth(self, p):\n",
    "    if self.is_root(p):\n",
    "        return 0\n",
    "    else:\n",
    "        return 1 + self.depth(self.parent(p))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "求深度的算法在最坏的情况下时间复杂度为O(n)，n为树中节点的总个数，这种情况出现于，节点位于树的最低端，且所有树的节点只构成一条路径，此时深度为n-1。即如果n个节点构成一条路径，则路径最低端深度为n-1。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "节点p的高度：\n",
    "\n",
    "1. 如果是叶子节点，则高度为0。\n",
    "2. 如果不是叶子节点，则为孩子的最大高度+1。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "深度最大的节点是最下面的叶子节点，高度最大的节点是根节点。深度和高度的最大值相等。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "非空树的高度是树根节点的高度，也是所有节点高度的最大值，还是所有节点深度的最大值。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "def _height1(self):                                           ## 树的高度\n",
    "    return max((self.depth(p) for p in iter(self)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这种算法在最坏情况下的运行时间为$O(n^2)$。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "def _height2(self, p):\n",
    "    if self.is_leaf(p):\n",
    "        return 0\n",
    "    else:\n",
    "        return 1 + max(self._height2(c) for c in self.children(p))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这种算法的时间复杂度为$O(n)$。因为生成子节点，共需要n-1次操作，而递归调用次数为n次，非递归部分的运行时间为$O(1)$，总的时间复杂度为$O(n)$。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在抽象类中，用户可能希望不需要输入根节点就得出整个树的高度，因此再次对求高度的方法进行封装："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "def height(self, p=None):\n",
    "    if p == None:\n",
    "        p = self.root()\n",
    "    return self._height2(p)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8.2 二叉树"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "二叉树是具有以下属性的`有序树`：\n",
    "\n",
    "1. 每个节点最多有两个孩子节点\n",
    "2. 每个子节点被命名为左孩子和右孩子\n",
    "3. 对于每个节点的孩子节点，在顺序上，左孩子先于右孩子"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "以左孩子和右孩子为根节点的子树分别称为左子树和右子树。\n",
    "\n",
    "完全二叉树：每个节点要么没有子节点（即该节点为叶子节点、外部节点），要么有两个子节点，则该二叉树称为完全二叉树或者满二叉树，反之称为不完全二叉树。\n",
    "\n",
    "二叉树可以用于决策——yes or not，回答每个问题之后获得决策，即决策树；二叉树可以表示算术，叶子是数字，内部节点是运算。这两者都是完全二叉树，一个有两个选项，一个是因为加减乘除都需要两个数。、\n",
    "\n",
    "递归方式定义二叉树：左子树和右子树一般情况下都为二叉树。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.2.1 二叉树的抽象数据类型"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "二叉树支持3种额外的访问方法：\n",
    "\n",
    "1. T.left(p): 返回p左子节点的位置，若没有左子节点，返回None\n",
    "2. T.right(p): 同理\n",
    "3. T.sibling(p): 返回p兄弟节点的位置，若没有兄弟节点，返回None（左子节点和右子节点互为兄弟节点）"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "继承原来的Tree抽象基类（同时也继承了Position类作为嵌套类），编写BinaryTree抽象基类，仍然保持抽象性，因为还没有到需要完善内部细节的时候。新的BinaryTree抽象基类声明了新的抽象方法left和right，不过仍然是在之后BinaryTree的子类处再定义。不过只有底层方法未编写，对于只需要调用其他方法的方法，就直接编写。BinaryTree重选ing基类实现了sibling方法，因为可以用lefr、right和parent方法实现，还实现了children方法（覆盖了之前Tree抽象基类的children方法），也是通过调用其他方法。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "class BinaryTree(Tree):\n",
    "    \n",
    "    def left(self, p):\n",
    "        NotImplementedError('must be implemented by subclass')\n",
    "    \n",
    "    def right(self, p):\n",
    "        NotImplementedError('must be implemented by subclass')\n",
    "        \n",
    "    def sibling(self, p):\n",
    "        parent = self.parent(p)\n",
    "        if parent is None:\n",
    "            return None\n",
    "        else:\n",
    "            if self.left(parent) == p:\n",
    "                return self.right(parent)\n",
    "            else:\n",
    "                return self.left(parent)\n",
    "    \n",
    "    def children(self, p):\n",
    "        if self.left(p) is not None:\n",
    "            yield self.left(p)\n",
    "        if self.right(p) is not None:\n",
    "            yield self.right(p)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.2.2 二叉树的属性"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "将深度为d的所有节点称为树T的d层，则d层最多只有$2^d$个节点。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "非空完全二叉树的外部节点比内部节点多一个。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8.3 树的实现"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "之前的Tree类和BinaryTree类都只是抽象基类，没有定义树的存储方法，对于具体的访问方法也没有实现。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.3.1 二叉树的链式存储结构"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "模仿链表的结构，可以让每个节点都存储自身元素的引用、指向父节点和左右子节点的引用，若没有，则指向None。树还维护一个指向根节点的引用和一个size变量用于表示T的所有节点数量。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "class LinkedBinaryTree(BinaryTree):\n",
    "    \n",
    "    class _Node:                                                    ## 二叉树节点，存储元素，左右节点和父节点\n",
    "        \n",
    "        __slots__ = '_element', '_parent', '_left', '_right'\n",
    "        \n",
    "        def __init__(self, element, parent=None, left=None, right=None):\n",
    "            self._element = element\n",
    "            self._parent = parent\n",
    "            self._left = left\n",
    "            self._right = right\n",
    "    \n",
    "    class Position(BinaryTree.Position):                            ## 继承时，方法可以自动继承和更新，但是如果要更新嵌套类，需要手动继承\n",
    "        \n",
    "        def __init__(self, container, node):                        ## 引用的节点以及所属的树\n",
    "            self._container = container\n",
    "            self._node = node\n",
    "        \n",
    "        def element(self):\n",
    "            return self._node._element\n",
    "        \n",
    "        def __eq__(self, other):                                    ## 嵌套类的继承，之前已经定义好了__ne__和__eq__的关系，这里只定义一个\n",
    "            return type(self) is type(other) and other._node is self._node    ## 两个Position引用同一个_Node视为相等\n",
    "        \n",
    "    def _validate(self, p):                                         ## 判断是否为当前树的位置，如果是返回位置存储的节点，不是则报错\n",
    "        if not isinstance(p, self.Position):\n",
    "            raise TypeError('p must be proper Position style')\n",
    "        if p._container is not self:\n",
    "            raise ValueError('p does not belong to this container')\n",
    "        if p._node._parent is p._node:                              ## 无效节点的父节点会定义为自身\n",
    "            raise ValueError('p is no longer valid')         \n",
    "        return p._node\n",
    "\n",
    "    def _make_position(self, node):                                ## 封装节点为位置\n",
    "        return self.Position(self, node) if node is not None else None\n",
    "        \n",
    "    def __init__(self):                                             ## 初始化为空树\n",
    "        self._root = None\n",
    "        self._size = 0\n",
    "    \n",
    "    def __len__(self):\n",
    "        return self._size\n",
    "    \n",
    "    def root(self):\n",
    "        return self._root\n",
    "    \n",
    "    def parent(self, p):\n",
    "        node = self._validate(p)\n",
    "        return self._make_position(node._parent)\n",
    "    \n",
    "    def left(self, p):\n",
    "        node = self._validate(p)\n",
    "        return self._make_position(node._left)\n",
    "    \n",
    "    def right(self, p):\n",
    "        node = self._validate(p)\n",
    "        return self._make_position(node._right)\n",
    "    \n",
    "    def num_children(self, p):                                      ## 抽象基类Tree中的num_children没有具体实现，因为不同树的实现是不同的\n",
    "        node = self._validate(p)                                    ## 实现了num_children也就实现了is_leaf\n",
    "        count = 0\n",
    "        if node._left is not None:\n",
    "            count += 1\n",
    "        if node._right is not None:\n",
    "            count += 1\n",
    "        return count"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`is`和`==`的区别：`is`判断两个对象是否指向同一块内存，`==`判断两个对象是否相等，一般调用类自身定义的`__eq__()`方法。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "更新方法不在抽象基类中给出的原因是不同的具体实现定义同样的方法，可能因为效率问题而有所不同，而且抽象基类提供的更新方法不一定都适用于子类，比如二叉树，就不能有左右子节点之外的子节点。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "链式二叉树定义的更新方法：\n",
    "\n",
    "1. T.add_root(e): 创建存储e的根节点，返回根节点的位置，若树非空则报错。\n",
    "2. T.add_left(p, e): 将存储e的新节点作为p的左子节点并返回新节点的位置，若p已有左子节点则报错。\n",
    "3. T.add_right(p, e): 同上理。\n",
    "4. T.replace(p, e): 用存储e的新节点代替p位置存储的节点并返回之前存储的元素。\n",
    "5. T.delete(p): 删除p位置存储的节点，并用子节点代替自身，并返回p位置存储的元素，若不只一个子节点则报错。\n",
    "6. T.attach(p, T1, T2): 将T1，T2分别链接为位置p的节点的左右子树并将T1，T2重置为空树，若p不是叶子节点则报错。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "为了避免不必要的更新方法被LinkedBinaryTree的子类所继承，所有的方法都设为非公有。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "class LinkedBinaryTree(BinaryTree):\n",
    "    \n",
    "    class _Node:                                                    ## 二叉树节点，存储元素，左右节点和父节点\n",
    "        \n",
    "        __slots__ = '_element', '_parent', '_left', '_right'\n",
    "        \n",
    "        def __init__(self, element, parent=None, left=None, right=None):\n",
    "            self._element = element\n",
    "            self._parent = parent\n",
    "            self._left = left\n",
    "            self._right = right\n",
    "    \n",
    "    class Position(BinaryTree.Position):                            ## 继承时，方法可以自动继承和更新，但是如果要更新嵌套类，需要手动继承\n",
    "        \n",
    "        def __init__(self, container, node):                        ## 引用的节点以及所属的树\n",
    "            self._container = container\n",
    "            self._node = node\n",
    "        \n",
    "        def element(self):\n",
    "            return self._node._element\n",
    "        \n",
    "        def __eq__(self, other):                                    ## 嵌套类的继承，之前已经定义好了__ne__和__eq__的关系，这里只定义一个\n",
    "            return type(self) is type(other) and other._node is self._node    ## 两个Position引用同一个_Node视为相等\n",
    "        \n",
    "    def _validate(self, p):                                         ## 判断是否为当前树的位置，如果是返回位置存储的节点，不是则报错\n",
    "        if not isinstance(p, self.Position):\n",
    "            raise TypeError('p must be proper Position style')\n",
    "        if p._container is not self:\n",
    "            raise ValueError('p does not belong to this container')\n",
    "        if p._node._parent is p._node:                              ## 无效节点的父节点会定义为自身\n",
    "            raise ValueError('p is no longer valid')         \n",
    "        return p._node\n",
    "\n",
    "    def _make_position(self, node):                                ## 封装节点为位置\n",
    "        return self.Position(self, node) if node is not None else None\n",
    "        \n",
    "    def __init__(self):                                             ## 初始化为空树\n",
    "        self._root = None\n",
    "        self._size = 0\n",
    "    \n",
    "    def __len__(self):\n",
    "        return self._size\n",
    "    \n",
    "    def root(self):\n",
    "        return self._root\n",
    "    \n",
    "    def parent(self, p):\n",
    "        node = self._validate(p)\n",
    "        return self._make_position(node._parent)\n",
    "    \n",
    "    def left(self, p):\n",
    "        node = self._validate(p)\n",
    "        return self._make_position(node._left)\n",
    "    \n",
    "    def right(self, p):\n",
    "        node = self._validate(p)\n",
    "        return self._make_position(node._right)\n",
    "    \n",
    "    def num_children(self, p):                                      ## 抽象基类Tree中的num_children没有具体实现，因为不同树的实现是不同的\n",
    "        node = self._validate(p)                                    ## 实现了num_children也就实现了is_leaf\n",
    "        count = 0\n",
    "        if node._left is not None:\n",
    "            count += 1\n",
    "        if node._right is not None:\n",
    "            count += 1\n",
    "        return count\n",
    "    \n",
    "    def _add_root(self, e):\n",
    "        if self._root is not None:\n",
    "            raise ValueError('Root exists')\n",
    "        self._size = 1\n",
    "        self._root = self._Node(e)\n",
    "        return self._make_position(self._root)\n",
    "    \n",
    "    def _add_left(self, p, e):\n",
    "        node = self._validate(p)\n",
    "        if node._left is not None:\n",
    "            raise ValueError('Left child exists')\n",
    "        self._size += 1    \n",
    "        node._left = self.Node(e)\n",
    "        return self._make_position(node._left)\n",
    "    \n",
    "    def _add_right(self, p, e):\n",
    "        node = self._validate(p)\n",
    "        if node._right is not None:\n",
    "            raise ValueError('Right child exists')\n",
    "        self._size += 1\n",
    "        node._right = self._Node(e)\n",
    "        return self._make_position(node._right)\n",
    "    \n",
    "    def _replace(self, p, e):\n",
    "        node = self._validate(p)\n",
    "        old = node._element\n",
    "        node._element = e\n",
    "        return old\n",
    "    \n",
    "    def _delete(self, p):\n",
    "        node = self._validate(p)\n",
    "        if self.num_children(p) == 2:\n",
    "            raise ValueError('p have two children')\n",
    "        child = node._left if node._left else node._right\n",
    "        if child is not None:\n",
    "            child._parent = node._parent\n",
    "        if node is self._root:\n",
    "            self._root = child\n",
    "        else:\n",
    "            parent = node._parent\n",
    "            if parent._left is node:\n",
    "                parent._left = child\n",
    "            else:\n",
    "                parent._right = child\n",
    "            self._size -= 1\n",
    "        node._parent = node\n",
    "        return node._element\n",
    "    \n",
    "    def _attach(self, p, t1, t2):\n",
    "        node = self._validate(p)\n",
    "        if not self.is_leaf(p):\n",
    "            raise ValueError('position must be leaf')\n",
    "        if not type(self) is type(t1) is type(t2):\n",
    "            raise TypeError('Tree must be match')\n",
    "        self._size += len(t1) + len(t2)\n",
    "        if not t1.is_empty():\n",
    "            t1._root._parent = node                               ## 在t1和self连接之后就可以将t1重置为空了\n",
    "            node._left = t1._root\n",
    "            t1._root = None\n",
    "            t1._size = 0\n",
    "        if not t2.is_empty():\n",
    "            t2._root._parent = node\n",
    "            node._right = t2._root\n",
    "            t2._root = None\n",
    "            t2._size = 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "链式二叉树所有方法的时间复杂度都为O(1): len, is_empty, root, parent, left, right, sibling, children, num_children, is_root, is_leaf, add_root, add_left, add_right, replace, delete, attach。depth(p)为O(d+1)，height为O(n)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.3.2 基于数组表示的二叉树"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "给二叉树的每个节点编号，编号作为索引将所有节点的位置存储在数组中。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "编号的规则为：根节点编号为零，左子节点的编号为父节点编号乘2加1，右子节点的编号为父节点编号乘2加2。编号不一定连续。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "用数组表示二叉树的优势是，能用索引访问位置p，且root、parent、left、right等方法能通过索引的简单运算实现。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "表示二叉树的数组可能有空元素，因为节点的编号未必连续，对于极端的二叉树，只有n个元素却需要$2^n-1$的存储空间，这种指数级存储空间是不被允许的。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "用数组表示二叉树的缺点不仅仅是可能造成空间上的浪费，还有删除节点的效率很低，因为删除一个节点，其子树节点的编号都需要改变，即位置都需要移动。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.3.3 一般树的链式存储结构"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "与二叉树的链式存储结构的区别在于不再是维护left和right指针，而是维护一个容器（如一个Python list），容器中是所有子节点。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8.4 树的遍历算法"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "先序遍历：父节点先于所有子节点，即先遍历父节点，再遍历子节点，用递归实现，具体顺序为父节点——>左子树——>右子树（只是以二叉树为例，一般树可以根据子节点的顺序来遍历子树），其中左子树中所有节点先于右子树。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "后序遍历：子节点先于父节点，先遍历子节点，才能遍历父节点，用递归实现，具体表现为左子树——>右子树——>父节点，其中最先遍历的是左子树的叶子节点，然后是两个叶子节点的父节点，这样就是最先遍历的子树，然后遍历旁边的子树，接着遍历这些子树的父节点，这样就遍历了一颗更大的子树，这里面最终的就是递归的思想。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "先序遍历和后序遍历的时间复杂度为O(n)，n为节点数量，这是遍历算法的最佳运行时间，因为必须遍历n个节点，至少是O(n)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "广度优先遍历：按深度遍历，深度小的先于深度大的，即一层一层遍历。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在树的不同应用下应该采用不同的遍历顺序，比如井字棋，需要先判断第一步，再判断第二步，每一步的几种情况都是节点，那么应该采用的是广度优先遍历。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "中序遍历：一种二叉树独有的遍历方法，遍历顺序为左子树——>父节点——>右子树，也是用递归实现。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "二叉搜索树：每个节点存储一个数，其中比父节点小的数存储在左子树，比父节点大的数存储在右子树，对二叉搜索树采取中序遍历，可以实现从小到大的遍历，"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以从一个集合S构建一棵二叉搜索树T，从集合S中查找元素v可以转化为从二叉搜索树中查找元素v，从根节点开始，每次比较v和节点存储元素的大小，若v大，搜索右子树，若v小搜索左子树，若有节点元素与v相等，则查找成功，若到达空子树（叶子节点的子树为None），则查找不到。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "二叉搜索树的运行时间与高度有关，高度越小，运行时间越快（比较的次数少），n个节点的二叉搜索树的高度可以很小也可以很大。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.4.4 用Python实现树遍历"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. iter(T): 生成所有元素的迭代器，可以基于T.positions()产生的所有位置的迭代器。\n",
    "2. T.positions(): 产生所有位置的迭代器。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "def __iter__(self):\n",
    "    for p in self.positions():\n",
    "        yield p.element()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "T.positions()可以基于前面所述几种遍历算法。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "先序迭代器：递归算法需要输入位置，然而我们定义的位置迭代器理应是不需要输入参数的，因此先定义一个非公开的生成器，输入根节点可以递归产生先序迭代器，再定义一个公开的方法进行封装。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里要注意生成器的递归与传统函数的递归略有不同，，生成器的递归在递归的时候需要调用for循环。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "def preorder(self):\n",
    "    if not self.is_empty():\n",
    "        for p in self._subtree_preorder(self.root()):\n",
    "            yield p\n",
    "\n",
    "def _subtree_preorder(self):\n",
    "    yield p\n",
    "    for c in self.children(p):\n",
    "        for other in self._subtree_preorder(c):\n",
    "            yield other"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在Tree类中，可以这样将先序遍历设为默认遍历方式："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "def positions(self):\n",
    "    return self.preorder()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "后序迭代器："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "def postorder(self):\n",
    "    if not self.is_empty():\n",
    "        for p in self._subtree._postorder():\n",
    "            yield p\n",
    "\n",
    "def _subtree_postorder(self, p):\n",
    "    for c in self.children(p):\n",
    "        for other in self._subtree_postorder(c):\n",
    "            yield other\n",
    "    yield p"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "广度优先遍历迭代器：只有这个迭代器不是用递归编写的，对于广度优先的遍历顺序，是采用队列来实现的。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "def breadthfirst(self):\n",
    "    if not self.is_empty():\n",
    "        fringe = LinkedQueue()\n",
    "        fringe.enqueue(self.root())\n",
    "        while not fringe.is_empty():\n",
    "            p = fringe.dequeue()\n",
    "            yield p\n",
    "            for c in self.children(p):\n",
    "                fringe.enqueue(c)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "中序遍历迭代器（二叉树）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "def inorder(self):\n",
    "    if not self.is_empty():\n",
    "        for p in self._subtree_inorder(self.root()):\n",
    "            yield p\n",
    "\n",
    "def _subtree_inorder(self, p):\n",
    "    if self.left(p) is not None:\n",
    "        for other in self._subtree_inorder(self.left(p)):\n",
    "            yield other\n",
    "    yield p\n",
    "    if self.right(p) is not None:\n",
    "        for other in self._subtree_inorder(self.right(p)):\n",
    "            yield other"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "将中序遍历作为二叉树遍历的默认遍历"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "def positions(self):\n",
    "    return self.inorder()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.4.5 树遍历的应用"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "目录表和计算机文件系统是树先序遍历的应用之一。对于没有缩进的目录表，直接通过positions迭代器，打印element即可，默认调用先序遍历，前提是将目录存储在树中。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.4.5 树遍历的应用 —— 定义一些函数"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "生成目录表使用先序遍历即可，缩进目录表的缩进可以根据深度来定，但是如果每次都求一次深度，时间复杂度会达到$O(n^2)$，导致效率低下。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "要生成缩进，深度是必须的，但是深度的浪费在于，没有利用父节点的深度比子节点小1这个规律，对于父子节点，都采取遍历的方式求深度，造成效率低下。对于这种情况，可以选择在递归时输入参数，将求深度包含在打印的递归中一起进行，可以节省时间。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "def preorder_indent(T, p, d):\n",
    "    print(2 * d * ' ' + str(p.element()))\n",
    "    for c in T.children(p):\n",
    "        preorder_indent(T, c, d + 1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在缩进目录的基础上，如何加上数字编号：将路径作为一个额外的参数加入递归中。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "def preorder_label(T, p, d, path):\n",
    "    label = '.'.join(str(j + 1) for j in path)\n",
    "    print(2 * d * ' ' + label, p.element())\n",
    "    path.append(0)\n",
    "    for c in T.children(p):\n",
    "        preorder_label(T, c, d + 1, path)\n",
    "        path[-1] += 1\n",
    "    path.pop()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "树的括号表示：子节点在父节点的括号中，兄弟节点之间用逗号分隔。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "def parenthesize(T, p):\n",
    "    print(p.element(), end='')               ## end默认设置为换行符，这样多个print会打印多行\n",
    "    if not T.leaf():\n",
    "        first_time = True\n",
    "        for c in T.children(p):\n",
    "            sep = '(' if first_time else ', '\n",
    "            print(sep, end = '')\n",
    "            first_time = False\n",
    "            parenthesize(T, c)\n",
    "        print(')', end='')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "计算磁盘空间：递归即可，树的应用在于从父节点获得子节点以输入递归函数。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 8.4.6 欧拉图和模板方法模式"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "好像很难的样子，先跳过，以后再说。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8.5 表达式树"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "表达式树：表示算术表达式的二叉树。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "每个内部节点必须存储一个用于定义二进制操作的字符串，每个叶子节点存储一个数值。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "基于二叉树ADT的定义的方法，实现将表达式树打印成表达式，表达式树是二叉树的一种应用，支持独有的方法。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "两种初始化的方式：\n",
    "\n",
    "1. ExpressionTree(value): 创建一棵在根处存储给定值的树。\n",
    "2. ExpressionTree(op, E1, E2): 创建一棵在根处存储字符串op（比如加号）的树，E1和E2分别作为ExpressionTree的实例，成为左右子树。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "ExpressionTree类将继承LinkedBinaryTree类，使用非公开方法_add_root()和_attach()进行初始化。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "表达式树print成算术表达式形式——使用中序遍历，结合先序遍历和后序遍历编写`__str__()`方法。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "class ExpressionTree(LinkedBinaryTree):\n",
    "    \n",
    "    def __init__(self, token, left=None, right=None):\n",
    "        super().__init__()                                 ## 调用父类的初始化方法\n",
    "        if not isinstance(token, str):\n",
    "            raise TypeError('token must be a string')\n",
    "        self._add_root(token)\n",
    "        if left is not None:\n",
    "            if token not in '+-*x/':\n",
    "                raise ValueError('token must be valid operator')\n",
    "            self._attach(self.root(), left, right)\n",
    "    \n",
    "    def __str__(self):\n",
    "        pieces = []\n",
    "        self._parenthesize_recur(self.root(), pieces)\n",
    "        return ''.join(pieces)\n",
    "    \n",
    "    def _parenthesize_recur(self, p, result):\n",
    "        if self.is_leaf(p):\n",
    "            result.append(str(p.element()))\n",
    "        else:\n",
    "            result.append('(')\n",
    "            self._parenthesize_recur(self.left(p), result)\n",
    "            result.append(p.element())\n",
    "            self._parenthesize_recur(self.right(p), result)\n",
    "            result.append(')')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "计算表达式的值："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "def evaluate(self):\n",
    "    return self._evaluate_recur(self.root())\n",
    "\n",
    "def _evaluate_recur(self, p):\n",
    "    if self.is_leaf(p):\n",
    "        return float(p.element())\n",
    "    else:\n",
    "        op = p.element()\n",
    "        left_val = self._evaluate_recur\n",
    "        right_val = self._evaluate_recur\n",
    "        if op == '+':\n",
    "            return left_val + right_val\n",
    "        elif op == '-':\n",
    "            return left_val + - right_val\n",
    "        elif op == '/':\n",
    "            return left_val / right_val\n",
    "        else:\n",
    "            return left_val * right_val"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "创建一棵表达式树：之前提供了在两个表达式树的基础上创建另一个表达式树，现在需要实现给定算术表达式的字符串，能够创建一棵表达式树。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果没有表达式树，直接给定字符串进行计算也是有困难的。一旦能够将算术表达式字符串转化为表达式树，其实也就意味着明晰了表达式的运算顺序等，也就意味着能通过算术表达式的字符串计算出结果。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "def build_expression_tree(tokens):\n",
    "    S = []\n",
    "    for t in tokens:\n",
    "        if t in '+-x/':\n",
    "            S.append(t)\n",
    "        elif t not in '()':\n",
    "            S.append(ExpressionTree(t))\n",
    "        elif t == ')':\n",
    "            right = S.pop()\n",
    "            op = S.pop()\n",
    "            left = S.pop()\n",
    "            S.append(ExpressionTree(op, left, right))\n",
    "    return S.pop()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8.6 练习"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "R-8.5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "def leaf_num(self, p):\n",
    "    if self.is_leaf(p):\n",
    "        return 1\n",
    "    elif p is None:\n",
    "        return 0\n",
    "    else:\n",
    "        return self.leaf_num(self.left(p)) + self.leaf_num(self.right(p))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "R-8.10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "def num_children(self, p):\n",
    "    return (self.left(p) is not None) + (self.right(p) is not None)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "R-8.15"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MutableLinkedBinaryTree(LinkedBinaryTree):\n",
    "    \n",
    "    def add_root(self, e):\n",
    "        return self._add_root(e)\n",
    "    \n",
    "    def add_left(self, p, e):\n",
    "        return self._add_left(p, e)\n",
    "    \n",
    "    def add_right(self, p, e):\n",
    "        return self._add_right(p, e)\n",
    "    \n",
    "    def replace(self, p, e):\n",
    "        return self._replace(p, e)\n",
    "    \n",
    "    def delete(self, p):\n",
    "        return self._delete(p)\n",
    "    \n",
    "    def attach(self, p, t1, t2):\n",
    "        return self._attach(p, t1, t2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "C-8.31"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "树所有内部节点的深度之和称为内部路径长度，所有外部节点的深度之和称为外部路径长度。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.2"
  },
  "toc": {
   "base_numbering": "1",
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": true,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "47.2px",
    "left": "495px",
    "top": "185px",
    "width": "276.6px"
   },
   "toc_section_display": true,
   "toc_window_display": false
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
